Exploration server for music in Saarland

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

A Robust Fitness Measure for Capturing Repetitions in Music Recordings With Applications to Audio Thumbnailing

Internal identifier: 000175 (Main/Exploration); previous: 000174; next: 000176

Authors: Meinard Müller [Germany]; Nanzhu Jiang [Germany]; Peter Grosche [Germany]

Source: IEEE Transactions on Audio, Speech, and Language Processing [ISSN 1558-7916]; 2013

RBID : Pascal:13-0149238

Abstract

The automatic extraction of structural information from music recordings constitutes a central research topic. In this paper, we deal with a subproblem of audio structure analysis called audio thumbnailing with the goal to determine the audio segment that best represents a given music recording. Typically, such a segment has many (approximate) repetitions covering large parts of the recording. As the main technical contribution, we introduce a novel fitness measure that assigns a fitness value to each segment that expresses how much and how well the segment "explains" the repetitive structure of the entire recording. The thumbnail is then defined to be the fitness-maximizing segment. To compute the fitness measure, we describe an optimization scheme that jointly performs two error-prone steps, path extraction and grouping, which are usually performed successively. As a result, our approach is even able to cope with strong musical and acoustic variations that may occur within and across related segments. As a further contribution, we introduce the concept of fitness scape plots that reveal global structural properties of an entire recording. Finally, to show the robustness and practicability of our thumbnailing approach, we present various experiments based on different audio collections that comprise popular music, classical music, and folk song field recordings.
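
The idea of scoring every segment and taking the fitness-maximizing one as the thumbnail can be sketched as follows. This is only an illustration under strong assumptions: the function `toy_fitness` is a hypothetical stand-in that counts exact, non-overlapping repetitions in a sequence of frame labels. It is not the fitness measure of the paper, which handles approximate repetitions through joint path extraction and grouping. The dictionary returned by `fitness_scape` plays the role of a scape plot in tabular form, with one value per (start, length) pair.

```python
def toy_fitness(labels, start, length):
    """Toy stand-in, NOT the paper's measure: fraction of the recording
    covered by other, non-overlapping exact repetitions of the segment."""
    n = len(labels)
    segment = labels[start:start + length]
    covered = 0
    pos = 0
    while pos + length <= n:
        overlaps_self = pos < start + length and start < pos + length
        if not overlaps_self and labels[pos:pos + length] == segment:
            covered += length
            pos += length  # jump past the match so repetitions do not overlap
        else:
            pos += 1
    return covered / n


def fitness_scape(labels):
    """Map every segment (start, length) to its toy fitness value."""
    n = len(labels)
    return {
        (start, length): toy_fitness(labels, start, length)
        for length in range(1, n + 1)
        for start in range(n - length + 1)
    }


def thumbnail(scape):
    """Return the fitness-maximizing segment; ties go to the first one
    in iteration order (shortest length, then earliest start)."""
    best = max(scape.values())
    for segment, value in scape.items():
        if value == best:
            return segment, value


labels = list("ABCABCXABC")  # hypothetical frame labels, one per frame
scape = fitness_scape(labels)
best_segment, best_value = thumbnail(scape)
print(best_segment, best_value)
```

Enumerating all segments is quadratic in the number of frames, which is acceptable for a toy input but says nothing about how the paper computes its measure efficiently.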



The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en" level="a">A Robust Fitness Measure for Capturing Repetitions in Music Recordings With Applications to Audio Thumbnailing</title>
<author>
<name sortKey="Muller, Meinard" sort="Muller, Meinard" uniqKey="Muller M" first="Meinard" last="Müller">Meinard Müller</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>International Audio Laboratories Erlangen, which is a joint institution of the University of Erlangen-Nuremberg and Fraunhofer IIS</s1>
<s2>91058 Erlangen</s2>
<s3>DEU</s3>
<sZ>1 aut.</sZ>
</inist:fA14>
<country>Allemagne</country>
<wicri:noRegion>91058 Erlangen</wicri:noRegion>
<placeName>
<settlement type="city">Erlangen</settlement>
<region type="land" nuts="1">Bavière</region>
<region type="district" nuts="2">District de Moyenne-Franconie</region>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Nanzhu Jiang" sort="Nanzhu Jiang" uniqKey="Nanzhu Jiang" last="Nanzhu Jiang">Nanzhu Jiang</name>
<affiliation wicri:level="3">
<inist:fA14 i1="02">
<s1>Saarland University and the Max-Planck-Institut für Informatik</s1>
<s2>66123 Saarbrücken</s2>
<s3>DEU</s3>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>Allemagne</country>
<placeName>
<region type="land" nuts="2">Sarre (Land)</region>
<settlement type="city">Sarrebruck</settlement>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Grosche, Peter" sort="Grosche, Peter" uniqKey="Grosche P" first="Peter" last="Grosche">Peter Grosche</name>
<affiliation wicri:level="3">
<inist:fA14 i1="02">
<s1>Saarland University and the Max-Planck-Institut für Informatik</s1>
<s2>66123 Saarbrücken</s2>
<s3>DEU</s3>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>Allemagne</country>
<placeName>
<region type="land" nuts="2">Sarre (Land)</region>
<settlement type="city">Sarrebruck</settlement>
</placeName>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">INIST</idno>
<idno type="inist">13-0149238</idno>
<date when="2013">2013</date>
<idno type="stanalyst">PASCAL 13-0149238 INIST</idno>
<idno type="RBID">Pascal:13-0149238</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000004</idno>
<idno type="wicri:Area/PascalFrancis/Curation">000010</idno>
<idno type="wicri:Area/PascalFrancis/Checkpoint">000003</idno>
<idno type="wicri:explorRef" wicri:stream="PascalFrancis" wicri:step="Checkpoint">000003</idno>
<idno type="wicri:doubleKey">1558-7916:2013:Muller M:a:robust:fitness</idno>
<idno type="wicri:Area/Main/Merge">000175</idno>
<idno type="wicri:Area/Main/Curation">000175</idno>
<idno type="wicri:Area/Main/Exploration">000175</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a">A Robust Fitness Measure for Capturing Repetitions in Music Recordings With Applications to Audio Thumbnailing</title>
<author>
<name sortKey="Muller, Meinard" sort="Muller, Meinard" uniqKey="Muller M" first="Meinard" last="Müller">Meinard Müller</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>International Audio Laboratories Erlangen, which is a joint institution of the University of Erlangen-Nuremberg and Fraunhofer IIS</s1>
<s2>91058 Erlangen</s2>
<s3>DEU</s3>
<sZ>1 aut.</sZ>
</inist:fA14>
<country>Allemagne</country>
<wicri:noRegion>91058 Erlangen</wicri:noRegion>
<placeName>
<settlement type="city">Erlangen</settlement>
<region type="land" nuts="1">Bavière</region>
<region type="district" nuts="2">District de Moyenne-Franconie</region>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Nanzhu Jiang" sort="Nanzhu Jiang" uniqKey="Nanzhu Jiang" last="Nanzhu Jiang">Nanzhu Jiang</name>
<affiliation wicri:level="3">
<inist:fA14 i1="02">
<s1>Saarland University and the Max-Planck-Institut für Informatik</s1>
<s2>66123 Saarbrücken</s2>
<s3>DEU</s3>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>Allemagne</country>
<placeName>
<region type="land" nuts="2">Sarre (Land)</region>
<settlement type="city">Sarrebruck</settlement>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Grosche, Peter" sort="Grosche, Peter" uniqKey="Grosche P" first="Peter" last="Grosche">Peter Grosche</name>
<affiliation wicri:level="3">
<inist:fA14 i1="02">
<s1>Saarland University and the Max-Planck-Institut für Informatik</s1>
<s2>66123 Saarbrücken</s2>
<s3>DEU</s3>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>Allemagne</country>
<placeName>
<region type="land" nuts="2">Sarre (Land)</region>
<settlement type="city">Sarrebruck</settlement>
</placeName>
</affiliation>
</author>
</analytic>
<series>
<title level="j" type="main">IEEE transactions on audio, speech, and language processing</title>
<title level="j" type="abbreviated">IEEE trans. audio speech lang. process.</title>
<idno type="ISSN">1558-7916</idno>
<imprint>
<date when="2013">2013</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt>
<title level="j" type="main">IEEE transactions on audio, speech, and language processing</title>
<title level="j" type="abbreviated">IEEE trans. audio speech lang. process.</title>
<idno type="ISSN">1558-7916</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Automatic recognition</term>
<term>Feature extraction</term>
<term>Information extraction</term>
<term>Musical acoustics</term>
<term>Optimization</term>
<term>Periodic structure</term>
<term>Robustness</term>
<term>Signal processing</term>
<term>Sound analysis</term>
<term>Structural analysis</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr">
<term>Reconnaissance automatique</term>
<term>Extraction caractéristique</term>
<term>Extraction information</term>
<term>Analyse son</term>
<term>Analyse structurale</term>
<term>Structure périodique</term>
<term>Optimisation</term>
<term>Acoustique musicale</term>
<term>Robustesse</term>
<term>Traitement signal</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">The automatic extraction of structural information from music recordings constitutes a central research topic. In this paper, we deal with a subproblem of audio structure analysis called audio thumbnailing with the goal to determine the audio segment that best represents a given music recording. Typically, such a segment has many (approximate) repetitions covering large parts of the recording. As the main technical contribution, we introduce a novel fitness measure that assigns a fitness value to each segment that expresses how much and how well the segment "explains" the repetitive structure of the entire recording. The thumbnail is then defined to be the fitness-maximizing segment. To compute the fitness measure, we describe an optimization scheme that jointly performs two error-prone steps, path extraction and grouping, which are usually performed successively. As a result, our approach is even able to cope with strong musical and acoustic variations that may occur within and across related segments. As a further contribution, we introduce the concept of fitness scape plots that reveal global structural properties of an entire recording. Finally, to show the robustness and practicability of our thumbnailing approach, we present various experiments based on different audio collections that comprise popular music, classical music, and folk song field recordings.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>Allemagne</li>
</country>
<region>
<li>Bavière</li>
<li>District de Moyenne-Franconie</li>
<li>Sarre (Land)</li>
</region>
<settlement>
<li>Erlangen</li>
<li>Sarrebruck</li>
</settlement>
</list>
<tree>
<country name="Allemagne">
<region name="Bavière">
<name sortKey="Muller, Meinard" sort="Muller, Meinard" uniqKey="Muller M" first="Meinard" last="Müller">Meinard Müller</name>
</region>
<name sortKey="Grosche, Peter" sort="Grosche, Peter" uniqKey="Grosche P" first="Peter" last="Grosche">Peter Grosche</name>
<name sortKey="Nanzhu Jiang" sort="Nanzhu Jiang" uniqKey="Nanzhu Jiang" last="Nanzhu Jiang">Nanzhu Jiang</name>
</country>
</tree>
</affiliations>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Sarre/explor/MusicSarreV3/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000175 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000175 | SxmlIndent | more
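
Without Dilib, the record can also be read with a general-purpose XML parser. The sketch below uses Python's standard `xml.etree.ElementTree` and makes one assumption worth stating: the record uses the `wicri:` and `inist:` prefixes without declaring them, so a namespace-aware parser rejects it as written. The helper `parse_record` therefore injects placeholder namespace URIs, which are invented here purely so the parser accepts the input and are not the real Wicri or INIST namespaces. The helper names and the shortened `SAMPLE` record are likewise illustrative.

```python
import xml.etree.ElementTree as ET

# Placeholder URIs (assumption): the record does not declare these prefixes.
PLACEHOLDER_NS = {"wicri": "urn:example:wicri", "inist": "urn:example:inist"}


def parse_record(xml_text):
    """Declare the missing prefixes on <record>, then parse the text."""
    decls = " ".join(f'xmlns:{p}="{u}"' for p, u in PLACEHOLDER_NS.items())
    patched = xml_text.strip().replace("<record>", f"<record {decls}>", 1)
    return ET.fromstring(patched)


def summarize(record):
    """Collect the title, author names and English keywords of a record."""
    analytic = record.find(".//analytic")
    return {
        "title": analytic.findtext("title"),
        "authors": [name.text for name in analytic.findall("author/name")],
        "keywords": [
            term.text
            for term in record.findall(".//keywords[@scheme='KwdEn']/term")
        ],
    }


# Shortened stand-in for the full record shown above.
SAMPLE = """
<record>
<TEI>
<teiHeader>
<fileDesc>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a">A Robust Fitness Measure</title>
<author>
<name last="Müller">Meinard Müller</name>
<affiliation wicri:level="1">
<country>Allemagne</country>
</affiliation>
</author>
</analytic>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Optimization</term>
<term>Robustness</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
</TEI>
</record>
"""

info = summarize(parse_record(SAMPLE))
print(info)
```

Looking up authors under `analytic` rather than `titleStmt` avoids listing each author twice, since the record repeats the author block in both places.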

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Wicri/Sarre
   |area=    MusicSarreV3
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     Pascal:13-0149238
   |texte=   A Robust Fitness Measure for Capturing Repetitions in Music Recordings With Applications to Audio Thumbnailing
}}

This area was generated with Dilib version V0.6.33.
Data generation: Sun Jul 15 18:16:09 2018. Site generation: Tue Mar 5 19:21:25 2024